Batch Normalization

Essence: apply normalization to the inputs of any layer in the network, not only to the raw input data.
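To make "normalizing at a layer" concrete, here is a minimal sketch of the training-time transform for a single feature across one mini-batch. The function name `batch_norm` and the example values are illustrative assumptions, not taken from these notes. `gamma` and `beta` stand for the learnable scale and shift from the paper, and `eps` is a small constant for numerical stability.

```python
import math

def batch_norm(batch, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one feature across a mini-batch, then scale and shift."""
    n = len(batch)
    mean = sum(batch) / n
    # Biased (population) variance over the mini-batch.
    var = sum((x - mean) ** 2 for x in batch) / n
    return [gamma * (x - mean) / math.sqrt(var + eps) + beta for x in batch]

# Hypothetical activations of one hidden unit for a mini-batch of 4 examples.
activations = [2.0, 4.0, 6.0, 8.0]
normalized = batch_norm(activations)
print(normalized)
```

With the default `gamma=1.0` and `beta=0.0`, the output has approximately zero mean and unit variance. A real layer would learn `gamma` and `beta` and would use running statistics at inference time, which this sketch leaves out.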

Internal Covariate Shift

Internal Covariate Shift is the change in the distribution of network activations due to the change in network parameters during training.

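In other words, the change in the distribution of the data seen by a network's intermediate (hidden) layers during training is called "Internal Covariate Shift".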

Pasted image 20231007204009.jpg
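We want to keep the pre-activation z from becoming too large or too small. Taking sigmoid as an example, when z is very large or very small (far from 0 in either direction), the derivative of the sigmoid function approaches 0. Because the size of each backpropagation step is positively correlated with the derivative of the activation function, the network's update steps become smaller and smaller, which slows convergence.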
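The saturation effect can be checked numerically. The following is a small sketch using only the standard library. The helper names `sigmoid` and `sigmoid_grad` and the sample values of z are chosen here for illustration.

```python
import math

def sigmoid(z):
    """Logistic sigmoid: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))

def sigmoid_grad(z):
    """Derivative of the sigmoid: sigmoid(z) * (1 - sigmoid(z))."""
    s = sigmoid(z)
    return s * (1.0 - s)

# Compare the gradient near 0 with the gradient far from 0.
for z in (-10.0, -2.0, 0.0, 2.0, 10.0):
    print(f"z = {z:6.1f}  sigmoid'(z) = {sigmoid_grad(z):.6f}")
```

The derivative peaks at 0.25 when z = 0 and falls toward 0 as |z| grows, so a layer whose pre-activations drift far from 0 passes back very small gradients.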

Pasted image 20231007204638.jpg

Paper: [1502.03167] Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift (arxiv.org)
Reference blog: Internal Covariate Shift: How Batch Normalization can speed up Neural Network Training | by Jamie Dowat | Analytics Vidhya | Medium